Delphi Suggestions - Compiler
Practical concerns

Thinking up some syntax is easy. Implementing it, and making it bulletproof enough to be usable, is a totally different matter. The same applies to extensions that are only shorthand: implementing and stabilising them is often not worth the trouble, and simply adding an IDE template is far easier. I added some very initial and braindead comments here and there, which I hope will make people see the downsides of some of the features.

The FPC team uses the following checklist for extensions: extension faq, and most compiler teams have similar criteria. In short the conditions are:
# Compatibility must not be compromised in any way. Existing codebases, at least on the Pascal level, must keep running. This is often more difficult than most people think.
# The extension must have real value. Anything that is only a shorter notation does not qualify, unless it is needed for compatibility with an existing Pascal/Delphi codebase. In practice this means it must make something possible that can't be done otherwise, or be a compatibility item.
# The change must fit in with the scope of the project: implementing a Pascal compiler which can have a RAD and a generic DB system. This excludes features like inline SQL and large garbage-collected object frameworks. The latter is maybe less stringent for Delphi, due to its recent .NET reorientation.

Also, the descriptions of wannabe features are often short and simplistic. Therefore I'd recommend to:
* Add a complete explanation of the feature in various usage scenarios.
* Explain why it is needed: what does it make possible?
* Explain how you would implement it somewhat efficiently. In Delphi's case: both native and .NET.
* Add lots of examples of typical use, and tests for possible problem cases and how to fix them.

Typical problems are:
* what to do if typing is less clear (e.g. passing to an overloaded method when there are several candidates)
* what to do when passing a symbol to a var parameter (e.g. properties can't be passed)
* what if an exception occurs in the middle of your construct's lifetime? How do you avoid e.g. memory leaks?
* if your new extension involves something with automatic memory management: there is a reason why the currently existing automated Delphi datatypes are not nestable.
* if related to procedure/method syntax, how do I declare a procedure/method variable of the sort? (procedure of object)
* features that add implicit try/finally frames (and, when heavily used, even multiple ones) are notoriously slow in loops. In normal cases the programmer can decide whether the try/except is necessary at this level (e.g. pull it outside of the loop), but with automatic features people don't pay attention to such "implementation details", and assume the compiler developer has done his job and made it swift.
* threading/parallelism
** Parallelism is hard to exploit in a non-JIT compiler. This is because you don't know up front how many times a loop is executed, whereas a JIT can optimize later runs with real-life profile data. On the other hand, this effect must not be overestimated: see this shootout FAQ item to get real-life numbers for Java, and keep in mind that this is about benchmarks that often do nothing other than iterate through the same loop.
** Trying to divide a problem '''automatically''' over multiple threads is usually slower than running the parts single-threaded in sequence. This is because locking is relatively expensive (especially on a multiprocessor system), and a compiler doesn't know enough about the border conditions of the problem to reduce excessive locking and synchronisation. In practice a syntax should also specify how often to check whether the other thread has already finished. If you are interested in the subject and related problems, search for OpenMP(i) on Google.
** Operating on anything ansistring-related is slower in parallel than straightforward single-threaded execution. Again, the locking for the copy-on-write semantics and in the heap managers is the problem.
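The remark about try/finally frames in loops can be made concrete with a small sketch. It is only an illustration under assumptions: the function names and the iteration count are made up, and no timings are claimed. The first variant sets up one object and one frame per iteration, roughly what an automatic feature would have to generate; the second is what a programmer can do by hand when the object can be reused.

```pascal
program HoistTryFinally;
{$IFDEF FPC}{$mode delphi}{$ENDIF}
uses
  Classes, SysUtils;

const
  Iterations = 1000; // arbitrary count, for illustration only

// Variant 1: one object and one try/finally frame per iteration.
function CountPerIteration: Integer;
var
  I: Integer;
  L: TStringList;
begin
  Result := 0;
  for I := 1 to Iterations do
  begin
    L := TStringList.Create;
    try
      L.Add(IntToStr(I));
      Inc(Result, L.Count);
    finally
      L.Free;
    end;
  end;
end;

// Variant 2: the programmer pulls the frame (and the object)
// outside the loop, so it is set up and torn down only once.
function CountHoisted: Integer;
var
  I: Integer;
  L: TStringList;
begin
  Result := 0;
  L := TStringList.Create;
  try
    for I := 1 to Iterations do
    begin
      L.Clear;
      L.Add(IntToStr(I));
      Inc(Result, L.Count);
    end;
  finally
    L.Free;
  end;
end;

begin
  WriteLn(CountPerIteration); // prints 1000
  WriteLn(CountHoisted);      // prints 1000
end.
```

Both variants compute the same result; the difference is only in how many frames and allocations are executed, which is the kind of decision an automatic feature takes away from the programmer.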
See also stringtypes to get a feel for what kind of mess you can expect under the hood if you are implementing a feature (in this case, anything related to string conversion).

Universal DCUs

DCU files do not need to be simple binary files; CodeGear should think about using a more generic type of DCU file.
:Why?
Maybe a tag-based system as in TIFF or a binary XML-like system would work best.
:Define relative "goodness"; what would this improve?
This will allow for a DCU that could be used by all Delphi versions of the future.
:In general, the source code is usable by all Delphi versions of the future (potentially). Since the compiler is required to use either source or DCU, the advantages of modifying DCUs are unclear.
It will also allow one DCU to include both 32 and 64 bit as well as .NET versions of the same code.
:It is expected that the source can/will be targeted towards any of these platforms. Why is modification to DCUs necessary?
:My opinion: please do NOT use "XML-like" DCUs. Parsing the source file itself will be of similar speed to parsing a bloated, slow XML file. --L505

Chrome?

Implement all the features of Chrome http://www.chromesville.com/language/ or buy it and integrate it into Delphi.

Implementation status: half of those require a GC or are plain and simply useless without .NET. One could integrate them with Delphi.NET but not with Delphi/native.

Generics

 MyList := TGenericList.Create ;
 Widths := TGenericList.Create;

Implementation status: See the FPC generics wiki page for a more thorough study of this subject in the _native_ language compiler (a .NET JIT has totally different border conditions due to everything being an object). The soon to be released (early summer) FPC 2.2 will ship with generics.

And just a note that FPC is free "market research" for CodeGear at no cost, and the languages should be kept compatible to keep the entire modern Pascal community bigger and stronger.
There are theories that making things proprietary keeps a tool stronger and more profitable, but in the case of "programmers" it is a bit different, since they are destined to share code. Therefore the languages should be kept compatible to grow the language, not just the tool. --L505

Var Initialization

 var I: Integer = 0;
 begin

instead of

 var I: Integer;
 begin
   I := 0

Implementation status: this syntax already exists, but the variable is only initialised on startup. So this would require additional syntax.

Enabling "Assignable typed constants" under the compiler options already allows a similar effect, but with declarations made inside "const" blocks. Some Delphians, however, do not like using const parameters as variables.

:But assignable typed constants are like globals in that they are not thread safe and do not serve the same goals, right? By the way, Free Pascal (as market research) supports initialized local-scope vars. --L505
:Another idea is to get rid of the assignable const syntax and use something more appealing, such as "pvar" (persistent var) or something. Using the ugly J+ and J- is not so fun, nor does "const" make any sense. --L505

Variable Constraints

The ability to define constraints on a variable; if these are violated, an exception is raised.

 var I: Integer; S: String; sI: String;
 constraint I > 15;
 constraint I mod 5 = 3;
 constraint length(S) < 50;
 constraint length(S) + I > 8;
 constraint (sI = IntToStr(I)) or (sI = '');

(probably with a more elegant syntax... and working like assertions, only in the $C+ state, right?)

Note that this is different from assertions: an assertion only checks a variable at a certain point in the program, whereas a constraint would check it every time it is accessed. Example:

 var i: integer; constraint i > 15;
 begin
   i := 16;
   Assert(i > 16); // checks now -> OK
   i := i - 1;     // the constraint is violated, but the assertion above cannot catch it.

:(SAR: Actually the assertion fails. If i = 16 then i cannot be greater than 16. Oops)
Constraints are somewhat similar to subrange types with range checking on, but allow for more complex conditions.

Implementation status: when would such constraints be evaluated? Note that "always" is way too costly. I also think far too many unexpected problems would be created in cases like the following (assume i has a constraint on size somehow):

 i := i + 10; // Constraint triggers here, even though there is an explicit check.
 if i > 20 then i := 10;

It would require very careful rewrites of existing codebases to be able to use this feature on a nontrivial scale. In addition, this behaviour is already quite easy to achieve with a simple class wrapper around the target datatype, using properties to monitor changes. It is unclear what the benefit of changing the compiler over a direct implementation would be, except possibly runtime speed.

Automatic Create/Free

Automatic call of Create/Free for objects created in a function/method, by using a specific language keyword:

 var S: TStringList;
 begin
   S := TStringList.Create;
   try
     ...
     S.Add('hi !');
     ...
   finally
     S.Free;
   end;
 end;

Are the calls to Create() and Destroy() _really_ useful? With a keyword or a special character we should be able to reduce this to:

 var S: TStringList; auto;
 begin
   S.Add('hi !');
 end;

Obviously, this syntax would embed try/finally statements in order to free the object even if an exception is thrown inside the function. If the constructor takes parameters, they could be provided in the VAR section:

 procedure SaveFile(const AFilePath: string);
 var S: TFileStream(AFilePath); auto;
 begin
   ...
 end;

 THobo = class(TUsers)
   FStrings: TStringList; auto;
 public
   FUsers: TStringList; auto;
 end;

_I would like a more Delphi-like syntax:_

 var S: auto TStringList;
 begin
   S.Add('hi !');
 end;

Implementation status: Such local examples are indeed doable. But they are relatively rare, especially since the compiler must be able to tell with absolute certainty that the reference is passed nowhere else.
IMHO the use would be close to zero. I consider the current solution (using an IDE template to pregenerate this) better and more transparent. Learn to use them!

On the other hand, it could be understood that use of the 'auto' directive implies that the object is for local use only and will be freed automatically once the procedure/function has finished executing. If the object is to be passed on as a reference, then the traditional create method would be used instead of 'auto'. Alternatively, the concept could be extended further, e.g.

 var S: TStringList; auto; retain;

...enabling the local object to be created automatically and also retained.

It does have a certain appeal. Many methods defined in classes make use of locally created objects which are then freed immediately at the end of the method. Auto create/free means fewer lines of source code and less try/finally nesting, which makes the source code easier to read. Developers have less to type, and since these local objects are freed automatically there is less chance of a memory leak slipping in, so developers can concentrate on other things.

Multithreading

 procedure DoSomethingInAThread (AString:String;AnInteger:Integer); async;
 procedure DoSomethingElseInAThread (AString:String;AnInteger:Integer); async group 3;
 procedure DoSomethingElse2InAThread (AString:String;AnInteger:Integer); async group 3;

This would execute the procedure in a separate thread. With all these multi-core CPUs getting to the market, this is becoming more and more of a priority.

A global list of threads, where threads are registered upon construction and unregistered on destruction. This list would then have a method GetCurrentThread: TThread, so one can retrieve the thread that is currently running.

Waiting for Threads to finish

 WaitForThreads;
 WaitForThreads(3);

Implementation status: Which threads? How long? Why?

Another Idea

Have a look at Cilk. Cilk is an algorithmic multithreaded language.
The philosophy behind Cilk is that a programmer should concentrate on structuring the program to expose parallelism and exploit locality, leaving Cilk's runtime system with the responsibility of scheduling the computation to run efficiently on a given platform. Thus, the Cilk runtime system takes care of details like load balancing, paging, and communication protocols.

To give you an idea of how simple it is to write parallel programs using Cilk, here is a sample of what could be done similarly in Delphi: an implementation of the familiar recursive Fibonacci program in which the recursive calls are executed in parallel:

 function fib (n:integer): integer; cilk;
 var x,y:integer;
 begin
   if (n < 2) then
     result:=n
   else begin
     x := spawn fib (n-1);
     y := spawn fib (n-2);
     sync;
     result:=x+y;
   end;
 end;

Notice that Cilk uses a runtime system and Delphi is natively compiled. How would Delphi determine computation scheduling?

AsyncCalls

Have a look at AsyncCalls. It allows easy multithreading in Delphi.

With AsyncCalls you can execute multiple functions at the same time and synchronize them at any point in the function or method that started them. This allows you to execute time-consuming code, whose result is needed at a later time, in a different thread. While the asynchronous function is executing, the calling function can do other tasks.

The AsyncCalls unit offers a variety of function prototypes to call asynchronous functions. There are functions that can call asynchronous functions with one single parameter of the type TObject, Integer, AnsiString, WideString, IInterface, Extended or Variant. Another function allows you to transfer a user-defined value type (record) to the asynchronous function, where it can be modified. And there are functions that can call asynchronous functions with a variable number of arguments. The arguments are specified in a const array of const and are automatically mapped to normal function arguments.
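For comparison, the "start work, do something else, then wait" pattern discussed in this section can be written today with the plain RTL TThread class. The sketch below deliberately does not use the AsyncCalls API or the proposed async/WaitForThreads syntax; TSumThread and ParallelSum are hypothetical names made up for the illustration, and the cthreads unit is only needed for Free Pascal on Unix-like systems.

```pascal
program WaitForWorkers;
{$IFDEF FPC}{$mode delphi}{$ENDIF}
uses
  {$IFDEF UNIX}cthreads,{$ENDIF}
  Classes, SysUtils;

type
  // Hypothetical worker: sums the integers FFrom..FTo in its own thread.
  TSumThread = class(TThread)
  private
    FFrom, FTo: Integer;
    FSum: Int64;
  protected
    procedure Execute; override;
  public
    constructor Create(AFrom, ATo: Integer);
    property Sum: Int64 read FSum;
  end;

constructor TSumThread.Create(AFrom, ATo: Integer);
begin
  inherited Create(True); // created suspended; Start runs it later
  FFrom := AFrom;
  FTo := ATo;
end;

procedure TSumThread.Execute;
var
  I: Integer;
begin
  FSum := 0;
  for I := FFrom to FTo do
    FSum := FSum + I;
end;

// Splits 1..N over two threads and waits for both explicitly.
function ParallelSum(N: Integer): Int64;
var
  A, B: TSumThread;
begin
  A := nil;
  B := nil;
  try
    A := TSumThread.Create(1, N div 2);
    B := TSumThread.Create(N div 2 + 1, N);
    A.Start;
    B.Start;
    // The caller is free to do other work here.
    A.WaitFor; // explicit answer to "which threads? how long?"
    B.WaitFor;
    Result := A.Sum + B.Sum;
  finally
    A.Free;
    B.Free;
  end;
end;

begin
  WriteLn(ParallelSum(1000)); // prints 500500
end.
```

The explicit WaitFor calls answer the "which threads, how long" questions by hand; a language-level construct would have to define those answers for every caller.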
Case

Allow case to handle more than ordinals, and to handle more than lists.

 case s of
   'abc': DoSomething;
   'xyz': begin
            DoLots;
            DoLots;
            DoLots;
          end;
   else DoSomethingElse;
 end;

 case Sender of
   Button1: DoSomething;
   BitBtn13: DoSomethingElse;
   CheckBox1:
     case ListBox1.Items.Count of
       <10: DoSomething;
       10..12: DoSomethingElse;
       >12,<25: DoThis;
       >=25: DoThat;
     end;
 end;

Implementation notes: The trouble with this is that case would then be a runtime construct, which it isn't now. This could then cause all kinds of unpredictable results (e.g. what if more than one condition is true?). It would probably force every case label to be evaluated, causing a great slowdown.

Note that currently case is roughly equivalent to a series of integer compares, which is transformed into a lookup table if the values are close enough to one another. For this, all options of the case statement must be known at compile time, and a logical order must be determinable (if I order a series of integers, and I find that i=4, then the compiler doesn't have to check the other options). The suggestion above is something totally different, more like the BASIC case statement, which is semantically equivalent to a nested series of IFs.

Perhaps the compiler could evaluate each case statement intelligently and generate fast code for ordinal types (the current behaviour for case statements) and slower 'runtime' evaluation code for non-ordinal types. That would give developers the best of both worlds: high-performance case for ordinal types, while enabling case statements to also handle non-ordinal types. From a source code readability and maintenance point of view, a case statement that handles non-ordinal types is preferable to nesting a series of IF/ELSE IF statements.

Map

 function map(p:procedure; a:array);

example:

 var a:array of Integer; s:Integer;
 procedure addsum(i:integer);
 begin
   s:=s+i;
 end;

 s:=0;
 map(addsum,a);
 writeln(s);

(This dirty version of closures has been doable for ages... or am I missing something?)
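As a rough illustration of the "doable for ages" remark, here is a minimal sketch of map built on an ordinary procedure variable. The names TIntProc, Map and SumViaMap are made up for the example. Note that the callback is a global procedure working on a global accumulator: a plain procedure variable holds only a code address, so it cannot point at a nested procedure that uses its parent's locals.

```pascal
program MapSketch;
{$IFDEF FPC}{$mode delphi}{$ENDIF}

type
  // Plain procedure variable type: a single code address.
  TIntProc = procedure(I: Integer);

var
  S: Integer; // global accumulator; a plain procvar cannot capture locals

procedure AddSum(I: Integer);
begin
  S := S + I;
end;

// Calls P once for every element of A.
procedure Map(P: TIntProc; const A: array of Integer);
var
  K: Integer;
begin
  for K := Low(A) to High(A) do
    P(A[K]);
end;

function SumViaMap(const A: array of Integer): Integer;
begin
  S := 0;
  Map(AddSum, A);
  Result := S;
end;

begin
  WriteLn(SumViaMap([1, 2, 3, 4])); // prints 10
end.
```

The reliance on a global accumulator is the "dirty" part: it works, but it is neither reentrant nor thread safe.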
Implementation status: the Turbo Vision foreach already did something like this using procedure variables. However, the Borland implementation of procedure variables as a single address is not very suitable for this. You'll need full ISO procedure variable support for this to be useful, so that you can pass both a local procedure and a global procedure to the same map() function. See Apple Pascal, where this is used heavily.

FPC status: on the Mac compatibility to-do list (full ISO procvar support).

Split, Join, Push and Pop for Arrays

(having array helpers instead would make implementing this very easy)

Implementation status: I don't understand what this is about. Probably close to adding methods to records. Since there is only one use, a stack class, I suggest you simply use a stack class.

Try

Why do we have to do something as ugly as:

 try
   try
     // yada
   except
     // yada
   end;
 finally
   // yada
 end;

rather than the elegant

 try
   // yada
 except
   // yada
 finally
   // yada
 end;

(The current syntax more clearly demonstrates that the finally is ALSO around the except block. If the typing concerns you, define an editor macro.)

Furthermore, the ugliness can often (in practice) be defeated by restructuring the code such that the try-except and the try-finally do not occur in the same procedure, in specific situations where such refactoring makes sense.

Regular Expressions

 var a:array of string; s:string;
 readln(s);
 for a in RegExpMatch(s,'/(.*)/') do
 begin
   writeln(a[1]);
 end;
 RegExpReplace(s,'/([a-z])/',inttostr(ord(RegExpMatch[1])));

(I'd prefer full BNF expressions that can also match nested constructs. Regexps can't even validate a simple mathematical expression.) Besides, you don't need a language extension for this, just a regexp unit that returns something collection-like that for each can handle. Of course it will be doubly dog slow.
Once for the foreach with a dynamic construct, and once for the regexp engine, which is an INTERPRETER.

There are already several (even freely) available regexp units/components. It is unclear what the benefit of the suggested changes would be.

Language Constructs

* Multiline String Constants
* Anonymous Arrays
* Anonymous functions/procedures (already available, right? although the syntax is somehow odd)

Implementation status: a multiline string is already possible, do:

 const x = 'something ' +
           ' even more' +
           ' a line with an ending'+lineending ; // d6+

Pascal

* allow the end before an else statement to have a semicolon;
:The lack of a semicolon tells the compiler that the following "else" belongs to the current "if" and not to a containing "if" statement. If the semicolon were allowed, the compiler would have to do more work (read: be less efficient) to determine which "if" statement the "else" belongs to. Eddie 18:16, 29 October 2006 (UTC)
:This is also known as the dangling else problem. The only sane solution would be to always require a block (no single-line blocks anymore). In that case one could also omit the begin everywhere, except the one starting a procedure/method. However, if you do this, you are already halfway to turning Pascal into Modula-2. 88.159.74.100 15:44, 17 May 2007 (UTC)
* a better syntax for the with statement, see QC for various suggestions.
:Half of them are braindead.

Class Helpers

Even though they started as a hack, having multiple class helpers for a class would be nice to have. Also imagine class helpers that could be registered as components; they would only apply when put on a form (as a non-visual component). Allow a specific class helper to be enabled or disabled.

 TNiceDBGrid=class helper for TDBGrid;
 TColoredDBGrid=class helper for TDBGrid;
 DBGrid1.ClassHelperTColoredDBGrid.Enabled:=false;

Implementation status: they still are a hack :-) If you forget to add the related units to the USES clause, your class is unenhanced, which might be dangerous. Use with care and sparingly.
I guess the Delphi team would be more interested in cutting down the risks of having added this than in expanding it further.

Interface Helpers

Same idea: why can't there also be helpers for interfaces?

Implementation status: because you would have to include all helpers for all objects that implement the interface, pretty much giving up the idea that you don't have to know what is behind the interface to be able to use it.

Arrays

Allow array keys to be other than integers. This includes an optimized string-based hash, as well as the ability to use any data type as an index.

 var Captions: Array[TComponent] of String; C:TComponent;
 begin
   for C in Components do
     Captions[C]:=C.Caption;
   //Do Some Stuff
   result:=join(' ',Captions);
 end;

See the note about array helpers; the new syntax for records should help here?

Implementation status: then it is not an array, but a map. In the case of your example, how do you avoid overly long search times to find the right string for the given component C?

Distributed

This should run in a cluster (MPI/PVM/????), as in

 procedure DoSomethingInACluster (AString:String;AnInteger:Integer); async; distributed;

or even:

 var a:Array of String;
 function MapSomething(DocID:Array of Integer):Array of String;
 function ReduceSomething(Maps:Array of String):Array of String;
 a:=ReduceSomething(MapSomething(1..1000000));

Inline Functions and Procedures

Have the possibility to inline functions (as in C/C++/Perl).

Status: Already implemented in D2005+ and FPC 2.0+.

Sparse sets

(Sparse sets are sets that span a large range of values, say up to 2 billion, but only allocate space depending on how many elements are in them.)
* Multiplexed on a dynarray implementation, basically an array of byte/word/integer (depending on the compile-time maximal range).
* Elements that are "on" are simply added to the array in a sorted manner.
* IN is implementable somewhat efficiently by binary search.
* Optionally one could try to implement ranges by reserving the high bit as a marker for a two-element range x-y, like e.g. UTF-8 does. To avoid problems with the binary search, probably both parts of the range would have the high bit set, or there would be some rule that ranges may only start at an even position.

The main reason would be e.g. sets of char for Unicode chars. Static sets would become way too big.

Unit Alias

 uses htmparser := JohnsFastHtmlParser;

Or

 uses JohnsFastHtmlParser as htmparser;

Component Pascal (BlackBox, Oberon) has unit aliases. It is not so apparent to many programmers why this is useful. It can solve many problems with the Turbo Pascal "file as our namespace" approach. I have talked about this on the newsgroups. I will post more info about it here when I get a chance, maybe even copying and pasting some of my previous comments about it into this article. --L505

Duplicate Object Instance

Objective: Provide an easy mechanism to duplicate an instance of any object, so that the duplicate has the exact same state as the original object.

Problem: Defining a new object that can duplicate an instance of itself together with its current state can be time-consuming and prone to error. It usually involves overriding the Assign method, then writing code to copy values across from each variable representing some aspect of the assigned object's current state. It is easy to miss a variable by mistake, or to forget to update the Assign method when a new variable is added to the class at a later date. It would be great if variable contents were copied across automatically from the source to the target object during the assignment.

Possible Solution: If variable declarations could be marked in some way, indicating that their contents are to be copied across when assigning, then the compiler could generate the necessary code itself, relieving the developer of a tedious task. Objects that can duplicate instances of themselves would be simple to define.
 type
   TBigBrownParcel=class(TObject)
     Width: integer; save;
     Height: integer; save;
     DeliveryRoute: TStringList; save;
     BoxContents: TStringList; save; cloneDuringAssign;
     OpenedToday: boolean;
   end;

The save directive lets the compiler know that specific variables are an integral part of the object, as opposed to temporary object state, and that their contents/values need to be preserved (copied/persisted) whenever the object is assigned to another object. This directive frees the developer from having to override the class's Assign method to handle the transfer of variables from the assigned object during an assign operation, since all variable/property declarations marked with the save directive are now copied across automatically.

Variables referencing other objects, such as the DeliveryRoute variable above, which points to a TStringList, are copied 'as is' (i.e. just the reference), unless the clone directive (cloneDuringAssign in the example above) is also supplied, in which case a unique copy of the referenced TStringList is created during the assignment.

The save directive also makes it easy to save any object's current state out to a stream without writing any code (all the variables marked with save are included in the stream by the base SaveToStream method). Incidentally, it could be useful to enhance SaveToStream to support a variety of output formats.
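For reference, this is roughly the hand-written code that the proposed directives would replace. It is a sketch under assumptions: TParcel, CloneParcel and ClonedCountAfterSourceChange are made-up names, the class descends from TPersistent because that is where Assign is declared, and only a subset of the fields from the example above is shown.

```pascal
program ManualAssign;
{$IFDEF FPC}{$mode delphi}{$ENDIF}
uses
  Classes, SysUtils;

type
  TParcel = class(TPersistent)
  private
    FWidth, FHeight: Integer;  // would be marked "save"
    FBoxContents: TStringList; // owned; would be "save" plus clone
    FOpenedToday: Boolean;     // transient; deliberately not copied
  public
    constructor Create;
    destructor Destroy; override;
    procedure Assign(Source: TPersistent); override;
    property Width: Integer read FWidth write FWidth;
    property Height: Integer read FHeight write FHeight;
    property BoxContents: TStringList read FBoxContents;
    property OpenedToday: Boolean read FOpenedToday write FOpenedToday;
  end;

constructor TParcel.Create;
begin
  inherited Create;
  FBoxContents := TStringList.Create;
end;

destructor TParcel.Destroy;
begin
  FBoxContents.Free;
  inherited Destroy;
end;

// Every persistent field must be listed here by hand; forgetting one
// after adding a new field is the error the proposal wants to prevent.
procedure TParcel.Assign(Source: TPersistent);
begin
  if Source is TParcel then
  begin
    FWidth := TParcel(Source).FWidth;
    FHeight := TParcel(Source).FHeight;
    FBoxContents.Assign(TParcel(Source).FBoxContents); // deep copy
  end
  else
    inherited Assign(Source);
end;

function CloneParcel(Src: TParcel): TParcel;
begin
  Result := TParcel.Create;
  Result.Assign(Src);
end;

// Clones a parcel, then changes the source; returns the clone's item count.
function ClonedCountAfterSourceChange: Integer;
var
  A, B: TParcel;
begin
  A := TParcel.Create;
  B := nil;
  try
    A.BoxContents.Add('books');
    B := CloneParcel(A);
    A.BoxContents.Add('pens'); // must not affect the clone
    Result := B.BoxContents.Count;
  finally
    B.Free;
    A.Free;
  end;
end;

begin
  WriteLn(ClonedCountAfterSourceChange); // prints 1
end.
```

The clone keeps its own copy of the string list, which corresponds to the cloned case in the proposal; copying only the reference instead would make both parcels share one list.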